Acknowledgements

The hrp2malaRia package was written by OJ Watson, a PhD Student with the Imperial College of London Mathematical Modeling group. This package was used in his recent Elife Publication, Watson OJ et al. 2017, Modelling the drivers of the spread of Plasmodium falciparum hrp2 gene deletions in sub-Saharan Africa. The ideas and much of theory that we will be discussing today was spear-headed by UNC Physician-Scientist, Jonathan Parr.

Additional thanks to Dayne Filer, Mike Fliss, and Sara Levintow for hacks.

Overview

Scenario

You are an Officer in the CDC’s elite Epidemic Intelligence Service (EIS) Malaria Branch and are called one day by a concerned ministry of health official in a Sub-Saharan Africa stating that their hospitals have seen several patients that appear to have malaria but had negative rapid diagnostic tests. They have read recent reports of hrp2 deletions and wonder if you can model this scenario and predict where they should be on the lookout for hrp2 deletions.

Using this document

  • R code chunks have a grey background
  • Outputs for all R code can be “toggled” to the right of all code chunks
  • While you can copy and paste this code into R, you will learn faster if you type out the commands yourself.
  • I have asked that you install several packages, including: remotes. tidyerverse, RColorBrewer, and cowplot. We will be using remotes to install some packages from Github as well.
  • Throughout the document, I have placed questions indicated by a line. As before, we suggest that you work on this with a partner, and answer the questions below (space is given, indicated by a line). But feel free to work on your own if you prefer. We will likely break frequently to discuss results.

Epidemiological & Biological Context

Malaria Burden

P. falciparum malaria is an infectious disease with a huge global burden, primarily concentrated in Africa. For an interactive map of P. falciparum prevalence, see the Malaria Atlas Project.

Malaria Life Cycle

The malaria life cycle is complex with several different stages. For the purposes of this exercise, it is only important to note that the malaria parasite is transmitted by a mosquito vector that infects the human-host.

Malaria Detection

In clinical and field settings, malaria is often diagnosed with either the use of light microscopy or rapid diagnostic tests (RDTs). In both cases, a finger-prick of blood is taken from patients and used to screen for malaria. Confirmation of infection by microscopy is made with direct visualization of the parasite by an expert microscopist while RDTs provide a “yes/no” result and can be performed by relatively anyone.

RDTs detect parasite antigens (e.g. a parasite-specific marker) from the patient’s peripheral blood. In the case of the P. falciparum, the vase majority of Pf-RDTs detect a P. falciparum specific protein, histidine-rich protein 2 (hrp2).

However, recent evidence has accumulated to suggest that P. falciparum parasites have undergone selection for hrp2-gene deletions, making them no longer detectable by many RDTs (“stealth parasites”). In theory, by avoiding detection, parasites can then remain untreated and have a higher likelihood of spreading.

Note, for the purposes of our discussions, we will ignore differences between the lower-limits of detection (and sensitivity/specificity) of these two diagnostic modalities.

Prevalence & Multiplicity of Infection

In regions of high malaria prevalence, it is common for individuals to experience numerous innocuous bites every single evening. For example, in the Democratic Republic of the Congo, it is estimated that the average individual experiences 60-400 infectious mosquito bites/person/year*. As a result of this intense burden, individuals can be infected with numerous different strains of malaria (termed “superinfection” – note, for the purpose of this thought exercise we will ignore “co-transmission” events).

  • The number of infectious mosquito bites per person per year is often termed the entomological inoculation rate (EIR) and is a measure of transmission intensity.

Exercise Objectives

In this lab, we are going to explore different scenarios where hrp2 deletions are expected to increase in frequency (e.g. be “selected for”) or decrease in frequency (e.g. be “selected against”). We will also explore the consequences of these increases through an epidemiological lens.

Hypothesis Generating

Before we began, discuss the following points with your neighbors (or as a group):

  1. If you were asked to a design an antigen-based detection method, what type of antigen would you pick (e.g. from an evolutionary point of view, what type of proteins/markers would be best)? _________________________________________________________

  2. Do you expect for parasite prevalence, and particularly superinfections, to affect the prevalence of hrp2 deletions?
    • Why (or Why not)? _____________________________
    • If you have multiple infections, what may be required of all them to go undetected? _______________________
  3. How can “silent” phenotypes be produced?
    • What are ways for a protein (e.g. a product of a gene) to be “removed”? _____________________
    • Would you expect a “fitness” cost be greater for a large deletion, a small deletion, or a silencing point mutation? _______________________________________

Model Specification

In the previous exercise led by Dr. Nunn, you first used a deterministic SIR model to analyze the spread of a “childhood disease” through a susceptible population. You then extended your deterministic SIR model to include fluctuations in the population size through births and deaths (e.g. a demographic model). Next, you modeled the uncertainty and the variation in your model through simulation using a stochastic approach. As a reminder, stochastic individual-based models essentially follow individuals through the different compartments and record their states/times as they move through the model. This explicitly models individual-level heterogeneity and allows for several possible extension/fine-level manipulations (e.g. see network stochastic models, etc.) In short, a deterministic model is a “closed-system” where we will arrive at the same answer every time (given the same parameters, initial conditions, etc). In contrast, a stochastic model let’s us simulate many possible iterations of our model and directly express the randomness (or really uncertainty) in our model parameters and outcome.

In this exercise, we are going to use an individual-based, stochastic malaria transmission model that is an extension of the “Imperial Model” (see OJ’s Material & Methods) for a full description via Griffin et al. 2014, 2015, 2016) OJ describes the Imperial Model fully.

Refer to Figure 5 from OJ’s manuscript. What kind of Compartment Model is the “hrp2-Imperial Model”? ______________________

Note, for the purpose of this exercise, we are going to ignore some of the (beautiful) complexities of the model and instead talk about this model in a Susceptible-Infected-Recovered-Susceptible framework (i.e. the U,A,D compartments are collapsed into the “Infected” category and Recovered corresponds to the period of active treatment/prophylaxis – the T,P compartments – when individuals are “immune”). What is another scenario that SIRS models can be applied to (hint: think about your tetanus vaccines)? __________________
We are going to be focusing on a few adaptations of “hrp2-Imperial Model”, namely:

  1. The addition of the parameter, \(f\epsilon_T\), or “the probability that the clinical case yields a positive diagnostic result and subsequently receives treatment”. Essentially, this parameter accounts for our “belief” in the RDTs by saying “given that you have clinical disease, will you go get treatment based on your RDT result”*.
  2. The contribution of hrp2 deleted versus wildtype strains to the mosquito infectious reservoir (i.e. \(C_D\), \(C_U\), \(C_T\)).
  3. The fitness of the hrp2 deleted versus wildtype strains. This parameter is modeled directly into the probability that a mosquito can transmit an hrp2 deleted parasite. Put simply, this essentially follows the classical definition of fitness, as a probability that a given “gene” (here an hrp2 gene deletion) is passed on in subsequent generations.

For each of the parameters listed above, which SIRS compartment will be most affected? ________________

* Note, for the purpose of this exercise, we will ignore cross-reactivity of the hrp3 epitope by setting \(\epsilon\) to 0 and set RDT adherence to 100%. Put simply, this means that if an individual only contains hrp2 deleted parasites, they will not receive treatment (but if they contain any wildtype parasites, they may receive treatment depending on their probability of seeking treatment).

Variables of Interest

This is copied from the README that accompanies the hrp2malaria package instructions. Please note, this is a subset of potential input variables to change and the output variables of interest.

Input Variables to Change

Variable Description
years Number of years simulation is run for. 20 years will be usually long enough to see hrp2 selection at low EIRs.
N Population size. The model is stochastic and so a larger N will enable clearer patterns to be found in model outputs, however, they will take longer to run. Population of 2000 should be a good balance of speed and clarity for demonstration purposes. For the analysis in the eLife modelling paper N was set to 100,000
strains.0 Starting ratio of strains in the population. This number must be an integer. The default is set to 10, which will mean that on initialization there will be roughly 10 times as many wild type strains as there are hrp2 deleted.
EIR Annual EIR, provided as a fraction, i.e. the default EIR of 20/365 represents an EIR of 20.
ft Frequency of treatment being sought upon developing clinical disease. Must be a number between 0 and 1, with the default being 0.4, i.e. 40% of cases seek treatment.
rdt.det The probability that an individual only infected with hrp2-deleted strains still produces a positive RDT. The default is 1, i.e. no selective pressure. For the eLife paper we used a value of 0.25 that represented the approximate 25% chance that a positive RDT would still be produced due to hrp3 protein epitope cross reactivity
rdt.nonadherence The probability that the result of an RDT is ignored, i.e. a negative RDT due to hrp2 deletions will still be treated. Default = 0, and in the eLife paper we used a value of 10% based on data collected by the WHO.
microscopy.use The proportion of cases that are tested by microscopy rather than RDT. Default = 0, and in the eLife paper we tested the impact of 30% of cases being diagnosed by microscopy, representing the ~70% of cases diagnosed being diagnosed by RDTs in 2014.
fitness Comparative fitness of hrp2 deleted parasites. Default = 1, which mean the wild type is equally as likely to be passed on to mosquitoes as an hrp2 deleted parasite. 0.8 results in the hrp2 deleted strain being 80% likely to be passed on, whereas a value of 1.2 would mean it is 20% more likely.

Output Variables of Interest

Variable Description
S.Times The timing of when the simulated population is sampled, i.e. the time at which all other series variables (variables that begin S.) were collected
S.Prev.All The prevalence of malaria within the whole population
S.Prev.05 The prevalence of malaria within under 5s
S.N.Dels The proportion of all parasites that are hrp2 deleted
S.N.Dels.05 The proportion of all parasites within under 5s that are hrp2 deleted
S.Incidence The daily clinical incidence of malaria. Multiply this by 365 to have the expected number of clinical cases of malaria a person within the whole population is expected to have in a year
S.Incidence.05 The daily clinical incidence of malaria. Multiply this by 365 to have the expected number of clinical cases of malaria a child under the age of 5 is expected to have in a year
S.Prev.Mono.D The proportion of infected individuals who are only infected with hrp2 deleted parasites
S.Prev.Mono.D.05 The proportion of infected individuals under the age of 5 who are only infected with hrp2 deleted parasites

Our Simulated World

For the purpose of the exercise, we are going to make some more simplifying assumptions. As noted above, we will set the models so that if an individual only contains hrp2 deleted parasites, they will not have a positive RDT, and therefore, will not receive treatment (but if they contain any wildtype parasites, they may receive treatment depending on their probability of seeking treatment). We are also going to set our populations and time steps to 2,000 individuals and 20 years, respectively.
However, we will vary the diagnostic modalities that populations use, the transmission intensity (EIR), and the fitness of the parasites.

Let’s Get Started

Installation

First you will need to make sure that you have the remotes packaged installed. This is a wonderful package that allows for R developers to install code from various repositories as a corner-stone of reproducible research. If you don’t already have this package, type install.packages("remotes") into your console (quick download). We will now download the hrp2malaria package that is the foundation of this lab.

# check to make sure user has the needed packages; if missing, install for them 
list.of.packages <- c("tidyverse", "remotes", "cowplot", "RColorBrewer")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)


remotes::install_github("OJWatson/hrp2malaRia")
remotes::install_github("nickbrazeau/EMSI2019hrp2lab") # hacky way to download this tutorial as a package
remotes::install_github("DavisVaughan/furrr") # we are going to use this for speed later
library(hrp2malaRia)
library(EMSI2019hrp2lab)
library(furrr)
library(tidyverse)
library(RColorBrewer)
library(cowplot)

set.seed(42) # please don't run this line! 

Scenario 1: Different Diagnostics

Imagine that there are three populations with a constant EIR of 1 (reminder, EIR is the entomological inoculation rate, or the number of infectious mosquito bites per person per year). In addition, all populations start with approximately 10% frequency of hrp2 deletions and hrp2 deleted and wildtype parasites have equal fitness. However, the populations differ by:

Population 1:

  • Only uses rapid diagnostic tests (RDTs) to diagnose malaria

Population 2:

  • Uses RDTs to diagnose 50% of cases and microscopy for the other 50%

Population 3:

  • Only uses microscopy to diagnose malaria

Output

As noted above, the hrp2_Simulation function returns many results that allow for many different questions to be investigated. For the purpose of our exercise, we will focus on the increase of hrp2 frequency (df$S.N.Dels) over time (df$S.Times) as well as incidence (df$S.Incidence) over time (df$S.Times)

Let’s run the code (once) and plot the result:

# Simulations take ~1 minute to run on my MacBook Pro (3.1 GHz Intel Core i7)
# Population 1

# as a side note, let's track how long this takes
start_time <- Sys.time()

senario1.pop1 <- hrp2malaRia::hrp2_Simulation(N=2000, 
                                                  years=20, 
                                                  strains.0 = 10,
                                                  rdt.nonadherence = 0, 
                                                  EIR=1/365, 
                                                  microscopy.use = 0, 
                                                  rdt.det = 0, 
                                                  fitness = 1
                                                  )
## [1] 2.000000000 0.938027691 0.017019987 0.005870036 1.871794872 0.000000000
## [1] 7.370000e+02 9.445417e-01 1.195698e-02 4.419006e-03 4.643766e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.373667e-01 1.690096e-02 6.650007e-03 4.614412e-01
## [6] 2.147059e+01
## [1] 2.207000e+03 9.318295e-01 2.100431e-02 8.083852e-03 4.679487e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.251919e-01 2.610919e-02 9.616674e-03 4.828042e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.239892e-01 2.737176e-02 9.556782e-03 1.442688e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.339786e-01 1.974521e-02 7.193909e-03 2.462888e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.286655e-01 2.372429e-02 8.527926e-03 4.691517e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.355400e-01 1.854760e-02 6.830082e-03 3.753213e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.340496e-01 1.967587e-02 7.192206e-03 4.697555e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  38.512 secs"
# Population 2
senario1.pop2 <- hrp2malaRia::hrp2_Simulation(N=2000, 
                                                  years=20, 
                                                  strains.0 = 10,
                                                  rdt.nonadherence = 0, 
                                                  EIR=1/365, 
                                                  microscopy.use = 0.5, 
                                                  rdt.det = 0, 
                                                  fitness = 1
                                                  )
## [1] 2.000000000 0.937746094 0.017301584 0.005870036 1.402048656 0.000000000
## [1] 7.370000e+02 9.438160e-01 1.253108e-02 4.570681e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.448219e-01 1.191571e-02 4.180118e-03 9.125000e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.470266e-01 1.043272e-02 3.458366e-03 1.363636e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.425597e-01 1.360800e-02 4.750007e-03 9.034653e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.486319e-01 8.828297e-03 3.457527e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.496720e-01 8.133835e-03 3.111833e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.381591e-01 1.683134e-02 5.927295e-03 9.358974e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.438357e-01 1.261418e-02 4.467828e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.426647e-01 1.347364e-02 4.779366e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  37.395 secs"
# Population 3
senario1.pop3 <-  hrp2malaRia::hrp2_Simulation(N=2000, 
                                                   years=20, 
                                                   strains.0 = 10,
                                                   rdt.nonadherence = 0, 
                                                   EIR=1/365, 
                                                   microscopy.use = 1, 
                                                   rdt.det = 0, 
                                                   fitness = 1
                                                   )
## [1]  2.000000000  0.938076341  0.016971337  0.005870036  0.472186287
## [6] 52.142857143
## [1] 737.00000000   0.93667950   0.01817712   0.00606110   0.47096774
## [6]   0.00000000
## [1] 1.472000e+03 9.362278e-01 1.792906e-02 6.760827e-03 9.299363e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.345923e-01 1.964799e-02 6.677386e-03 2.324841e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.445505e-01 1.184586e-02 4.521359e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.406812e-01 1.489725e-02 5.339300e-03 9.113608e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.396121e-01 1.552303e-02 5.782612e-03 4.562500e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.460088e-01 1.110151e-02 3.807386e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.477792e-01 9.845482e-03 3.293024e-03 4.511743e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.484229e-01 9.201337e-03 3.293510e-03 9.034653e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  37.314 secs"
end_time <- Sys.time()

sequentialtime <- end_time - start_time

# Now, let's plot these - all veriables that begin with S. are series variables collected over time. 
# let's plot the first population
plot(senario1.pop1$S.Times,senario1.pop1$S.N.Dels, xlab = "time (days)", ylab = "hrp2 deletion Frequency", ylim=c(0,1), col="red", type="l", lwd=2)
# and add the second population
lines(senario1.pop2$S.Times, senario1.pop2$S.N.Dels,lwd=2, col="blue")
# and add the third population
lines(senario1.pop3$S.Times, senario1.pop3$S.N.Dels,lwd=2, col="green")
# and add our legend
legend(2000, .25, legend=c("Only RDT use", "50% RDT use", "0% RDT Use"),
       col=c("red", "blue", "green"), lty=1, cex=0.8,
       box.lty=2, box.lwd=2)

Advanced Content

Given that this is a stochastic model, there will be run-to-run variability of each model iteration. This variability is important in quantifying our uncertainty of the stochastic process that we are modeling.
To get multiple runs of the model, you can simply copy and paste the code above (and rename the output objects [e.g. r1.2], as to not overwrite your previous results). Or, we can make a map dataframe, where we specify the different parameters we want for each run and use this dataframe as the “captain” calling the “plays”.

In this map dataframe, our parameter names will be columns and our parameter options will be cells with each row being its own respective model. Using a map file is a cornerstone of functional programming and a high-yield concept for reproducible research. Plus, as Dr. Nunn noted earlier, stochastic models are typically slower to run than deterministic models. As a result, we now have a need for speed, which is made easier with a map dataframe that we can “loop” through.

Why do you think stochastic models are typically slower to run than deterministic models? _____________________

More Advanced Content

We are going to use the furrr package to take advantage of the multiple processors on your computer (e.g. parallelize the simulations). This is simply for speed. If you are having trouble on your personal computer, you can change the furrr::future_pmap commands below to purrr::pmap and will get the same result (albeit at a slower pace).

Let’s try it out below:

hrp2map <- data.frame(run = c(1,1,1, 2,2,2, 3,3,3), # note you could use rep here (this is for clarity)
                      N=2000, 
                      years=20, 
                      strains.0 = 10,
                      rdt.nonadherence = 0, 
                      EIR=1/365, 
                      microscopy.use = c(0,0.5,1, 0,0.5,1, 0,0.5,1),
                      rdt.det = 0,
                      fitness = 1)

Let’s look at what we just made:

dplyr::glimpse(hrp2map)
## Observations: 9
## Variables: 9
## $ run              <dbl> 1, 1, 1, 2, 2, 2, 3, 3, 3
## $ N                <dbl> 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000,…
## $ years            <dbl> 20, 20, 20, 20, 20, 20, 20, 20, 20
## $ strains.0        <dbl> 10, 10, 10, 10, 10, 10, 10, 10, 10
## $ rdt.nonadherence <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ EIR              <dbl> 0.002739726, 0.002739726, 0.002739726, 0.002739…
## $ microscopy.use   <dbl> 0.0, 0.5, 1.0, 0.0, 0.5, 1.0, 0.0, 0.5, 1.0
## $ rdt.det          <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ fitness          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1

OK, let’s run our models and store them in our map dataframe.

# note, we have to tell R that scenario and populations aren't parameters that hrp2_Simulation will accept
future::plan(multiprocess)

start_time <- Sys.time()

hrp2map$results <- furrr::future_pmap(hrp2map[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")], 
            hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.937789037 0.017258642 0.005870036 0.481530343 0.000000000
## [1] 7.370000e+02 9.285488e-01 2.404352e-02 8.325436e-03 4.815303e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.360696e-01 1.839260e-02 6.455492e-03 4.703608e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.349764e-01 1.908321e-02 6.858071e-03 9.772423e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.296419e-01 2.254842e-02 8.727421e-03 1.473755e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.302527e-01 2.270138e-02 7.963609e-03 1.465863e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.260954e-01 2.578641e-02 9.035866e-03 9.681698e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.287237e-01 2.350859e-02 8.685455e-03 9.505208e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.267969e-01 2.503800e-02 9.082796e-03 2.327806e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.229371e-01 2.795968e-02 1.002092e-02 2.315990e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  73.558 secs"
## [1] 2.000000000 0.938093674 0.016954004 0.005870036 0.486018642 0.000000000
## [1] 7.370000e+02 9.416223e-01 1.431508e-02 4.980332e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.387285e-01 1.617671e-02 6.012498e-03 1.975643e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.439633e-01 1.266424e-02 4.290140e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.421197e-01 1.400115e-02 4.796855e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.428770e-01 1.324656e-02 4.794123e-03 4.758801e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.465764e-01 1.039426e-02 3.947086e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.479050e-01 9.387148e-03 3.625567e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.477657e-01 9.504744e-03 3.647261e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.523790e-01 6.297164e-03 2.241597e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  71.236 secs"
## [1] 2.000000000 0.937993009 0.017054669 0.005870036 0.922882427 0.000000000
## [1] 737.00000000   0.94649912   0.01050683   0.00391176   0.00000000
## [6]   0.00000000
## [1] 1.472000e+03 9.414193e-01 1.424447e-02 5.253974e-03 1.389594e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.402833e-01 1.498247e-02 5.651974e-03 9.090909e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.405978e-01 1.478139e-02 5.538530e-03 1.384324e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.411119e-01 1.434713e-02 5.458687e-03 1.382576e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.362304e-01 1.841741e-02 6.269895e-03 9.136421e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.420556e-01 1.379100e-02 5.071160e-03 9.136421e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.372267e-01 1.781254e-02 5.878466e-03 9.147870e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.339309e-01 1.991022e-02 7.076595e-03 9.182390e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  72.807 secs"
## [1] 2.000000000 0.938113298 0.016934380 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.446954e-01 1.191023e-02 4.312038e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.436102e-01 1.276107e-02 4.546453e-03 4.840849e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.436312e-01 1.253966e-02 4.746825e-03 9.530026e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.381188e-01 1.659446e-02 6.204419e-03 9.605263e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.324246e-01 2.068535e-02 7.807735e-03 4.853723e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.389675e-01 1.589635e-02 6.053833e-03 5.041436e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.380280e-01 1.717206e-02 5.717645e-03 9.746328e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.320837e-01 2.134408e-02 7.489918e-03 9.517601e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.347836e-01 1.896892e-02 7.165152e-03 9.383033e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  72.955 secs"
## [1] 2.000000000 0.937916177 0.017131501 0.005870036 1.843434343 0.000000000
## [1] 7.370000e+02 9.394610e-01 1.579236e-02 5.664325e-03 9.335038e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.398447e-01 1.535750e-02 5.715503e-03 4.727979e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.387940e-01 1.626389e-02 5.859834e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.418029e-01 1.406254e-02 5.052293e-03 9.592641e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.408651e-01 1.498898e-02 5.063675e-03 2.410832e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.420547e-01 1.387456e-02 4.988497e-03 1.463904e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.498220e-01 8.098954e-03 2.996793e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.491058e-01 8.868538e-03 2.943401e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.494329e-01 8.432191e-03 3.052619e-03 4.655612e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  72.018 secs"
## [1] 2.000000000 0.937806548 0.017241131 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.266840e-01 2.507510e-02 9.158619e-03 2.298489e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.391882e-01 1.568970e-02 6.039765e-03 9.205549e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.415743e-01 1.425392e-02 5.089454e-03 9.443726e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.438571e-01 1.247399e-02 4.586656e-03 9.079602e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.466662e-01 1.043451e-02 3.817034e-03 9.275731e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.444941e-01 1.226461e-02 4.158978e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.501976e-01 7.707313e-03 3.012767e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.509242e-01 7.250067e-03 2.743465e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.489112e-01 8.633443e-03 3.373096e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  36.348 secs"
## [1] 2.000000000 0.937802078 0.017245600 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.326059e-01 2.066730e-02 7.644533e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.295670e-01 2.301951e-02 8.331239e-03 1.862245e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.388112e-01 1.655598e-02 5.550485e-03 9.299363e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.329360e-01 2.053707e-02 7.444623e-03 1.372180e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.356599e-01 1.872161e-02 6.536165e-03 4.562500e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.393568e-01 1.579163e-02 5.769308e-03 9.263959e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.380146e-01 1.693450e-02 5.968573e-03 4.661558e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.307728e-01 2.218504e-02 7.959909e-03 2.281250e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.261455e-01 2.553316e-02 9.239033e-03 8.968059e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  72.803 secs"
## [1] 2.000000000 0.938109335 0.016938344 0.005870036 1.477732794 0.000000000
## [1] 7.370000e+02 9.462491e-01 1.058312e-02 4.085539e-03 9.811828e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.475303e-01 9.878088e-03 3.509295e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.462744e-01 1.083537e-02 3.807893e-03 1.431373e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.488855e-01 8.807959e-03 3.224259e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.514765e-01 6.871997e-03 2.569170e-03 4.721863e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.516637e-01 6.877652e-03 2.376373e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.537310e-01 5.195849e-03 1.990863e-03 4.579674e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.542535e-01 4.839139e-03 1.825064e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.552773e-01 4.172819e-03 1.467588e-03 4.631980e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  70.323 secs"
## [1] 2.000000000 0.938139311 0.016908367 0.005870036 1.363636364 0.000000000
## [1] 7.370000e+02 9.436287e-01 1.274357e-02 4.545454e-03 9.431525e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.473742e-01 9.986031e-03 3.557464e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.486448e-01 9.078603e-03 3.194313e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.510047e-01 7.225494e-03 2.687548e-03 4.691517e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.476797e-01 9.722557e-03 3.515493e-03 4.815303e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.429700e-01 1.305331e-02 4.894417e-03 4.777487e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.385152e-01 1.629715e-02 6.105325e-03 1.450331e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.378777e-01 1.687732e-02 6.162723e-03 4.802632e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.393386e-01 1.571528e-02 5.863851e-03 1.458056e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  71.362 secs"
end_time <- Sys.time()
paralleltime <- end_time -  start_time

Just to prove a point, check out how much faster your code ran in parallel versus sequentially:

sequentialtime
## Time difference of 1.888728 mins
paralleltime
## Time difference of 2.062363 mins

Now let’s visualize our plots of the proportion of hrp2 deleted parasites over time.

# note, I wrote this function for you, it returns a ggplot
plot.scenario1.run1 <- plothrp2models(pop1 = hrp2map$results[[1]],
                                      pop2 = hrp2map$results[[2]],
                                      pop3 = hrp2map$results[[3]],
                                      labels = c("Only RDT use", "50% RDT use", "0% RDT use")) # you need to manually add the labels

# you can manually add a title like this too 
plot.scenario1.run1 <- plot.scenario1.run1 + ggtitle("hrp2 Simulation Scenario 1, Run 1")


# now consider run 2
plot.scenario1.run2 <- plothrp2models(pop1 = hrp2map$results[[4]],
                                      pop2 = hrp2map$results[[5]],
                                      pop3 = hrp2map$results[[6]],
                                      labels = c("Only RDT use", "50% RDT use", "0% RDT use"))
plot.scenario1.run2 <- plot.scenario1.run2 + ggtitle("hrp2 Simulation Scenario 1, Run 2")

# now consider run 3
plot.scenario1.run3 <- plothrp2models(pop1 = hrp2map$results[[7]],
                                      pop2 = hrp2map$results[[8]],
                                      pop3 = hrp2map$results[[9]],
                                      labels = c("Only RDT use", "50% RDT use", "0% RDT use"))
                                      
plot.scenario1.run3 <- plot.scenario1.run3 + ggtitle("hrp2 Simulation Scenario 1, Run 3")

# Plot them together
cowplot::plot_grid(plot.scenario1.run1, 
                   plot.scenario1.run2,
                   plot.scenario1.run3,
                   ncol=1)

Questions for Simulation 1 with regards to hrp2 frequency

  • How does the difference in diagnostic modality affect the proportion of hrp2 deletions? __________________
  • How much variation did you appreciate between runs? __________________________________

  • Was there anything unusual about your plots? Why was it unusual? _______________________________

Try running the models for a sample size of 100,000 (like the eLife paper). Note, this will take some time but the larger N should elicit a clearer/more consistent pattern.

Scenario 2: Different Fitness

Again, imagine that there are three populations with a constant EIR of 1 (reminder, EIR is the entomological inoculation rate, or the number of infectious mosquito bites per person per year). In addition, all populations start with approximately 10% frequency of hrp2 deletions and all populations only use RDTs for diagnostics. However, the populations differ by:

Population 1:

  • Fitness is equal between hrp2-deleted and wildtype parasites

Population 2:

  • Fitness cost of hrp2-deletion is approximately 0.5 compared to wildtype parasites

Population 3:

  • Fitness benefit of hrp2-deletion is 1.5 compared to wildtype parasites
# going to overwrite our old hrp2map and make a new one

hrp2map <- data.frame(
                      run = c(1,1,1,2,2,2),
                      N=2000, 
                      years=20, 
                      strains.0 = 10,
                      rdt.nonadherence = 0, 
                      EIR=1/365, 
                      microscopy.use = 0,
                      rdt.det = 0,
                      fitness = c(0.5, 1, 1.5, 0.5, 1, 1.5)
                      )


hrp2map$results <- furrr::future_pmap(hrp2map[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")], 
            hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.937880420 0.017167259 0.005870036 0.475880052 0.000000000
## [1] 7.370000e+02 9.358669e-01 1.852343e-02 6.527393e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.355546e-01 1.847213e-02 6.890933e-03 9.383033e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.395899e-01 1.582276e-02 5.505091e-03 1.393130e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.448582e-01 1.190821e-02 4.151340e-03 1.379093e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.455246e-01 1.093836e-02 4.454800e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.448198e-01 1.210325e-02 3.994618e-03 4.626109e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.454009e-01 1.133661e-02 4.180228e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.407107e-01 1.491021e-02 5.296794e-03 4.451220e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.402042e-01 1.539070e-02 5.322835e-03 9.090909e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  58.379 secs"
## [1] 2.000000000 0.937675888 0.017371790 0.005870036 0.463786531 0.000000000
## [1] 7.370000e+02 9.305199e-01 2.247127e-02 7.926524e-03 4.473039e-01
## [6] 2.807692e+01
## [1] 1.472000e+03 9.323406e-01 2.101782e-02 7.559327e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.308617e-01 2.205894e-02 7.997089e-03 1.367041e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.297003e-01 2.281160e-02 8.405776e-03 4.596977e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.252118e-01 2.633644e-02 9.369431e-03 2.324841e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.252974e-01 2.602692e-02 9.593349e-03 9.468223e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.280622e-01 2.389288e-02 8.962687e-03 1.425781e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.312428e-01 2.182487e-02 7.850065e-03 9.554974e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.289003e-01 2.366435e-02 8.353103e-03 1.901042e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  58.982 secs"
## [1] 2.000000000 0.937896625 0.017151054 0.005870036 1.423927178 0.000000000
## [1] 7.370000e+02 9.460667e-01 1.075123e-02 4.099826e-03 2.385621e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.344377e-01 1.966164e-02 6.818419e-03 2.404480e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.358108e-01 1.839600e-02 6.710885e-03 4.945799e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.345697e-01 1.933488e-02 7.013175e-03 1.422078e+00
## [6] 2.281250e+01
## [1] 3.677000e+03 9.387914e-01 1.618095e-02 5.945351e-03 4.771242e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.381058e-01 1.677199e-02 6.039953e-03 9.182390e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.345114e-01 1.923666e-02 7.169698e-03 1.375628e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.300891e-01 2.260855e-02 8.220078e-03 4.697555e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.290805e-01 2.359091e-02 8.246274e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  58.321 secs"
## [1] 2.000000000 0.938131085 0.016916593 0.005870036 0.905707196 0.000000000
## [1] 7.370000e+02 9.441675e-01 1.214409e-02 4.606153e-03 4.579674e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.467722e-01 1.019846e-02 3.947062e-03 9.090909e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.459101e-01 1.089988e-02 4.107750e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.528423e-01 5.912049e-03 2.163331e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.540192e-01 5.074824e-03 1.823720e-03 4.445798e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.513227e-01 7.044500e-03 2.550489e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.499063e-01 7.975236e-03 3.036197e-03 8.869988e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.534952e-01 5.479154e-03 1.943340e-03 4.434994e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.517694e-01 6.674433e-03 2.473871e-03 9.023486e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  56.522 secs"
## [1]  2.000000000  0.937770584  0.017277094  0.005870036  1.852791878
## [6] 33.181818182
## [1] 7.370000e+02 9.423227e-01 1.377049e-02 4.824532e-03 4.740260e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.425903e-01 1.340750e-02 4.919912e-03 4.631980e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.353645e-01 1.898311e-02 6.570116e-03 2.278402e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.377276e-01 1.685320e-02 6.336939e-03 1.857506e+00
## [6] 3.318182e+01
## [1] 3.677000e+03 9.369633e-01 1.744207e-02 6.512304e-03 1.402049e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.313053e-01 2.157479e-02 8.037659e-03 9.383033e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.348300e-01 1.897195e-02 7.115783e-03 1.896104e+00
## [6] 2.027778e+01
## [1] 5.882000e+03 9.356115e-01 1.829820e-02 7.008022e-03 4.796321e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.342203e-01 1.954126e-02 7.156189e-03 1.869398e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  58.588 secs"
## [1] 2.000000000 0.938118728 0.016928951 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.367177e-01 1.792428e-02 6.275747e-03 1.423927e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.370264e-01 1.724262e-02 6.648713e-03 1.425781e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.346493e-01 1.931133e-02 6.957062e-03 1.420233e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.372285e-01 1.719770e-02 6.491561e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.363968e-01 1.791647e-02 6.604464e-03 9.580052e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.378270e-01 1.695242e-02 6.138308e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.396067e-01 1.554566e-02 5.765330e-03 1.427640e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.395068e-01 1.557861e-02 5.832318e-03 1.420233e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.339625e-01 1.966110e-02 7.294067e-03 4.752604e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  58.686 secs"
plot.scenario2.run1 <- plothrp2models(pop1 = hrp2map$results[[1]],
                                      pop2 = hrp2map$results[[2]],
                                      pop3 = hrp2map$results[[3]],
                                      labels = c("hrp2 Fitness 0.5", "hrp2 Fitness 1", "hrp2 Fitness 1.5")) 
plot.scenario2.run1 <- plot.scenario2.run1 + ggtitle("hrp2 Simulation Scenario 2, Run 1")


plot.scenario2.run2 <- plothrp2models(pop1 = hrp2map$results[[4]],
                                      pop2 = hrp2map$results[[5]],
                                      pop3 = hrp2map$results[[6]],
                                      labels = c("hrp2 Fitness 0.5", "hrp2 Fitness 1", "hrp2 Fitness 1.5")) 
plot.scenario2.run2 <- plot.scenario2.run2 + ggtitle("hrp2 Simulation Scenario 2, Run 2")

cowplot::plot_grid(plot.scenario2.run1, plot.scenario2.run2,ncol=1)

Questions for Simulation 2 with regards to hrp2 frequency

  • How do the differences fitness affect hrp2 frequency? __________________
  • Were these effects greater than the effects of the diagnostic modality (above)? __________________________________
    • Why (or why not) do you think that may be the case? ___________________________
    • What are the differences in a selective pressure vs. fitness? _________________________
  • Can a parasite that is less fit still propagate through the population (call forward to Dr. Mitchell/Dr. Nunn’s lecture) ? _______________________________

Scenario 3: Different EIR

Again, imagine that there are three populations where all populations start with approximately 10% frequency of hrp2 deletions, hrp2-deleted and wildtype strains have the same fitness, and all populations only use RDTs for diagnostics. However, the populations differ by:

Population 1:

  • EIR is 1

Population 2:

  • EIR is 10

Population 3:

  • EIR is 100
# going to overwrite our old hrp2map and make a new one

hrp2map <- data.frame(scenario = "scnr3",
                      population = c("pop1", "pop2", "pop3", "pop1", "pop2", "pop3"),  
                      run = c(1,1,1,2,2,2),
                      N=2000, 
                      years=20, 
                      strains.0 = 10,
                      rdt.nonadherence = 0, 
                      EIR=c(1/365, 10/365, 100/365, 1/365, 10/365, 100/365),
                      microscopy.use = 0,
                      rdt.det = 0,
                      fitness = 1
                      )


hrp2map$results <- furrr::future_pmap(hrp2map[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")], 
           hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.938062650 0.016985028 0.005870036 0.466751918 0.000000000
## [1] 7.370000e+02 9.332143e-01 2.049986e-02 7.203522e-03 4.667519e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.378587e-01 1.696339e-02 6.095624e-03 1.438896e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.338092e-01 1.993720e-02 7.171302e-03 1.437008e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.276029e-01 2.446566e-02 8.849182e-03 4.740260e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.290152e-01 2.327101e-02 8.631510e-03 4.765013e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.265831e-01 2.495267e-02 9.381969e-03 9.346991e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.244456e-01 2.669770e-02 9.774449e-03 1.848101e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.286924e-01 2.359720e-02 8.628153e-03 4.703608e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.308742e-01 2.212872e-02 7.914828e-03 1.407455e+00
## [6] 3.650000e+01
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  58.583 secs"
## [1]  2.00000000  4.73924343  0.18200596  0.05870036 10.08552632  0.00000000
## [1] 737.00000000   4.72204553   0.18911037   0.06879384   7.11963589
## [6]  33.18181818
## [1] 1.472000e+03 4.723458e+00 1.869436e-01 6.954863e-02 9.903101e+00
## [6] 4.562500e+01
## [1] 2.207000e+03 4.728257e+00 1.854155e-01 6.627773e-02 1.030809e+01
## [6] 3.318182e+01
## [1] 2.942000e+03 4.724901e+00 1.864322e-01 6.861657e-02 1.106061e+01
## [6] 0.000000e+00
## [1] 3677.0000000    4.7372785    0.1777422    0.0649290    8.1716418
## [6]   20.2777778
## [1] 4.412000e+03 4.716510e+00 1.935651e-01 6.987458e-02 7.785445e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 4.718370e+00 1.920891e-01 6.949051e-02 7.318296e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 4.720333e+00 1.906057e-01 6.901117e-02 1.017744e+01
## [6] 2.807692e+01
## [1] 6.617000e+03 4.727345e+00 1.839619e-01 6.864267e-02 8.212500e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  61.435 secs"
## [1]   2.0000000  42.6553671   1.9278993   0.5870036  72.1400524 104.2857143
## [1] 737.0000000  42.4551214   1.9934233   0.7217252  76.3618421  78.2142857
## [1] 1472.0000000   42.4658225    1.9835096    0.7209379   79.1392904
## [6]   78.2142857
## [1] 2207.0000000   42.4187713    2.0152221    0.7362766   70.5666667
## [6]   36.5000000
## [1] 2942.000000   42.410492    2.025317    0.734461   69.183007    0.000000
## [1] 3677.0000000   42.4439460    1.9930017    0.7333222   79.2434211
## [6]   56.1538462
## [1] 4412.0000000   42.3597724    2.0702581    0.7402395   80.9533074
## [6]  104.2857143
## [1] 5147.0000000   42.3378093    2.0706755    0.7617852   73.3668342
## [6]   52.1428571
## [1] 5882.0000000   42.3850642    2.0392527    0.7459531   74.3860759
## [6]  156.4285714
## [1] 6617.000000   42.347591    2.057942    0.764737   64.665354   97.333333
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  64.949 secs"
## [1] 2.000000000 0.937916126 0.017131552 0.005870036 0.974632844 0.000000000
## [1] 7.370000e+02 9.365093e-01 1.778042e-02 6.627985e-03 4.765013e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.357766e-01 1.834104e-02 6.800057e-03 1.405648e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 9.364182e-01 1.797209e-02 6.527382e-03 4.602774e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.304741e-01 2.232570e-02 8.117910e-03 1.423927e+00
## [6] 0.000000e+00
## [1] 3.677000e+03 9.292778e-01 2.315344e-02 8.486480e-03 1.910995e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.238639e-01 2.756527e-02 9.488524e-03 2.935657e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 9.264893e-01 2.514220e-02 9.286263e-03 4.790026e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.210457e-01 2.902277e-02 1.084921e-02 2.426862e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.261356e-01 2.556113e-02 9.220999e-03 1.427640e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  58.717 secs"
## [1] 2.00000000 4.73577449 0.18547490 0.05870036 7.60416667 0.00000000
## [1] 737.00000000   4.69113409   0.21200994   0.07680571   8.00645161
## [6]   0.00000000
## [1] 1.472000e+03 4.710380e+00 1.981550e-01 7.141472e-02 8.337563e+00
## [6] 0.000000e+00
## [1] 2.207000e+03 4.696873e+00 2.070172e-01 7.605996e-02 1.249049e+01
## [6] 0.000000e+00
## [1] 2.942000e+03 4.707389e+00 2.004991e-01 7.206145e-02 1.387833e+01
## [6] 2.807692e+01
## [1] 3677.0000000    4.7100353    0.1976881    0.0722263    9.3112245
## [6]    0.0000000
## [1] 4.412000e+03 4.708275e+00 1.988245e-01 7.285007e-02 9.864865e+00
## [6] 0.000000e+00
## [1] 5.147000e+03 4.717344e+00 1.920883e-01 7.051695e-02 1.320413e+01
## [6] 0.000000e+00
## [1] 5.882000e+03 4.705526e+00 2.032337e-01 7.118963e-02 1.222938e+01
## [6] 0.000000e+00
## [1] 6617.0000000    4.6980781    0.2074973    0.0743743    9.3112245
## [6]    0.0000000
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  61.663 secs"
## [1]  2.0000000 42.6368754  1.9463910  0.5870036 74.9337748 28.0769231
## [1] 737.000000  42.157668   2.201757   0.810845  80.309618 104.285714
## [1] 1472.0000000   42.1425733    2.2174143    0.8102824   72.4128686
## [6]   99.5454545
## [1] 2207.0000000   42.0362058    2.2998826    0.8341815   80.1219512
## [6]  136.8750000
## [1] 2942.0000000   42.2479283    2.1432675    0.7790742   74.9104683
## [6]   66.3636364
## [1] 3677.0000000   42.2928851    2.1101388    0.7672461   69.5704698
## [6]   33.1818182
## [1] 4412.0000000   42.2503483    2.1400656    0.7798561   69.8829201
## [6]  162.2222222
## [1] 5147.0000000   42.2801160    2.1204898    0.7696641   84.7233468
## [6]  219.0000000
## [1] 5882.0000000   42.2632834    2.1317238    0.7752628   74.7708895
## [6]   40.5555556
## [1] 6617.0000000   42.3409661    2.0698572    0.7594467   69.9343832
## [6]   26.0714286
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  65.457 secs"
plot.scenario3.run1 <- plothrp2models(pop1 = hrp2map$results[[1]],
                                      pop2 = hrp2map$results[[2]],
                                      pop3 = hrp2map$results[[3]],
                                      labels = c("EIR 1", "EIR 10", "EIR 100")) 
plot.scenario3.run1 <- plot.scenario3.run1 + ggtitle("hrp2 Simulation Scenario 3, Run 1")


plot.scenario3.run2 <- plothrp2models(pop1 = hrp2map$results[[4]],
                                      pop2 = hrp2map$results[[5]],
                                      pop3 = hrp2map$results[[6]],
                                      labels = c("EIR 1", "EIR 10", "EIR 100")) 
plot.scenario3.run2 <- plot.scenario3.run2 + ggtitle("hrp2 Simulation Scenario 3, Run 2")

cowplot::plot_grid(plot.scenario3.run1, plot.scenario3.run2,ncol=1)

Questions for Simulation 3 with regards to hrp2 frequency

  • How do the differences EIR affect hrp2 frequency? __________________
    • Do any of the models seem odd (e.g. are any “overwhelmed”)? _________________________
  • Compare these effects with the effects above (above)? __________________________________
    • Why do you a high EIR results in so few hrp2-deleted parasites? ___________________________
    • Is there a parameter that we could change in our simulation that would allow for a high EIR and still produce a high proportion of hrp2-deleted parasites? _________________________

Incidence, Prediction, and Inference

You’ve now started to answer our Ministry of Health Official’s question, as we have shown that regions of low malaria transmission (low EIR) and high proportion of RDT use are predictive of hrp2-deletion fixation (Watson et al. 2017). In addition, we have shown that a large fitness cost associated with the hrp2-deletion (and a low starting frequency of the hrp2-deletion strain proportion) leads to rapid decay of the hrp2-deleted phenotype (Watson et al. 2017). As a result, we think that these hrp2-deleted parasites are viable and potentially problematic in low-endemicity areas (of note, OJ’s 2017 predictions have been pretty accurate to date, particularly in Eritrea and other regions in Northern Africa).

However, our Ministry of Health Official asks the next logical question: “Should I no longer be using RDTs”?

It is now a good time to note that this question mixes prediction and causal inference, both vast fields of study (the latter being the UNC Epi Department’s forte). We are going to glaze over a huge number of assumptions and caveats (which I apologize for).

Up until this point, we have been modeling the frequency (or proportion) of hrp2 parasites in each population. However, our question now asks about the utility of RDTs in a setting of hrp2 deletions. As such, we will take the “worst” possible scenario and compare two populations in a low EIR setting, where one population only uses RDT (and seeks treatment based on the result) and the other only uses microscopy (and also seeks treatment). We will then focus in on differences in incidence between these two populations.

# I am only going to do one run, but feel free to do more

incidencemap <- data.frame(run = 1,
                           N=2000, 
                           years=20, 
                           strains.0 = 10,
                           rdt.nonadherence = 0, 
                           EIR=1/365, 
                           microscopy.use = c(0,1),
                           rdt.det = 0,
                           fitness = 1)

incidencemap$results <- furrr::future_pmap(incidencemap[,c("N", "years", "strains.0", "rdt.nonadherence", "EIR", "microscopy.use", "rdt.det", "fitness")], 
            hrp2malaRia::hrp2_Simulation)
## [1] 2.000000000 0.938077617 0.016970061 0.005870036 1.380832282 0.000000000
## [1] 7.370000e+02 9.369766e-01 1.754639e-02 6.394711e-03 1.409266e+00
## [6] 0.000000e+00
## [1] 1.472000e+03 9.325998e-01 2.069151e-02 7.626408e-03 9.554974e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.322743e-01 2.108312e-02 7.560305e-03 2.433333e+00
## [6] 0.000000e+00
## [1] 2.942000e+03 9.346138e-01 1.929417e-02 7.009789e-03 9.681698e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.311059e-01 2.189642e-02 7.915389e-03 1.429504e+00
## [6] 0.000000e+00
## [1] 4.412000e+03 9.391340e-01 1.585869e-02 5.925016e-03 9.205549e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.364444e-01 1.789061e-02 6.582744e-03 4.614412e-01
## [6] 0.000000e+00
## [1] 5.882000e+03 9.440127e-01 1.253737e-02 4.367675e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 6.617000e+03 9.387442e-01 1.612428e-02 6.049199e-03 9.287532e-01
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  38.472 secs"
## [1] 2.000000000 0.937969375 0.017078303 0.005870036 0.000000000 0.000000000
## [1] 7.370000e+02 9.451041e-01 1.158966e-02 4.223977e-03 4.424242e-01
## [6] 0.000000e+00
## [1] 1.472000e+03 9.491678e-01 8.318500e-03 3.431447e-03 4.579674e-01
## [6] 0.000000e+00
## [1] 2.207000e+03 9.461855e-01 1.066867e-02 4.063513e-03 4.478528e-01
## [6] 0.000000e+00
## [1] 2.942000e+03 9.416629e-01 1.414417e-02 5.110676e-03 4.445798e-01
## [6] 0.000000e+00
## [1] 3.677000e+03 9.436262e-01 1.263243e-02 4.659044e-03 4.534161e-01
## [6] 0.000000e+00
## [1] 4.412000e+03 9.508927e-01 7.385033e-03 2.639994e-03 9.136421e-01
## [6] 0.000000e+00
## [1] 5.147000e+03 9.493611e-01 8.426921e-03 3.129676e-03 0.000000e+00
## [6] 0.000000e+00
## [1] 5.882000e+03 9.509907e-01 7.156774e-03 2.770212e-03 9.707447e-01
## [6] 0.000000e+00
## [1] 6.617000e+03 9.491489e-01 8.625576e-03 3.143214e-03 0.000000e+00
## [6] 0.000000e+00
## [1] "2000 individuals analysed, over a period of  7350  days at  1  day intervals within  36.998 secs"

Let’s now pull out the incidence data.

ret <- EMSI2019hrp2lab::extracthrp2results(pop1 = incidencemap$results[[1]],
                                           pop2 = incidencemap$results[[2]])

We can open it up to see what we extracted. Note, the pfinc column is from the hrp2malaRia .$S.Incidence column.

glimpse(ret)
## Observations: 480
## Variables: 5
## $ pfinc        <dbl> 0.0004000000, 0.0003500000, 0.0004166667, 0.0004000…
## $ hrp2prev     <dbl> 0.08888555, 0.09215263, 0.09355403, 0.09229883, 0.0…
## $ hrp2prevmono <dbl> 0.03331232, 0.03490653, 0.04072324, 0.05271786, 0.0…
## $ time         <dbl> 30, 60, 91, 121, 152, 183, 213, 244, 275, 305, 336,…
## $ pop          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
ret %>% 
  ggplot(.) +
  geom_line(aes(x=time, y=pfinc, group = factor(pop), colour=factor(pop))) +
  xlab("Times (Days)") + ylab("Incidence") + 
  scale_color_manual(name = "", labels = c("Only RDT use", "Only Microscopy Use"), values = RColorBrewer::brewer.pal(8, "Set1")) +
    theme_minimal() +
    theme(
      plot.title = element_text(hjust=0.5),
      legend.position = "bottom"
    )

Preliminary Questions with regards to Incidence

  • Why is the incidence (through time) for both populations relatively low? How could you increase the incidence dramatically (by changing one parameter in the model code above)? _______________________
  • Why are there fluctuations in incidence through time? ________________

To answer this question, it may help to subset to just a year of “time”. You could do this by adding the following to the code chunk above:
ret %>% dplyr::filter(time < 365) %>% # this would be the first year ggplot(.) + ...

Without going into too much detail, incidence is indexed by time (this is an “open population”“, so we can’t use”risk“). As a result, we can model the incidence difference between these two populations at each time-point in order to get a sense of how many more infections the”Only RDT Use" population has as compared to “Only Microscopy Use”.

days <- seq(0, (365*20))

pop1 <- ret %>% 
  dplyr::filter(pop == 1)

pop2 <- ret %>% 
  dplyr::filter(pop == 2)

# quick local linear interpolation to get an incidence for every single day of the 20 years (not just the S.times)
pop1 <- approx(x = pop1$time, y = pop1$pfinc, xout = days, method = "linear") %>% 
  cbind.data.frame() %>% 
  dplyr::rename(time = x, pfinc = y) %>% 
  dplyr::mutate(pop = 1)

pop2 <- approx(x = pop2$time, y = pop2$pfinc, xout = days, method = "linear") %>% 
  cbind.data.frame() %>% 
  dplyr::rename(time = x, pfinc = y) %>% 
  dplyr::mutate(pop = 2)

# now let's rejoin and plot the difference

left_join(pop1, pop2, by = "time") %>% 
  dplyr::mutate(diff = pfinc.x - pfinc.y) %>% 
  dplyr::filter(!is.na(diff)) %>% # get rid of boundary issues ("cheating, we could do better")
  ggplot() +
  geom_line(aes(x=time, y=diff), color = "blue") + 
  geom_hline(yintercept = 0, color = "red") +
  xlab("Times (Days)") + ylab("Incidence") + 
  ggtitle("Incidence Difference between a Population Using only RDTs versus Microscopy") +
    theme_minimal() +
    theme(
      plot.title = element_text(hjust=0.5),
      legend.position = "bottom"
    )

Questions with regards to Incidence Plot

  • What is your interpretation of the Incidence Difference Plot Above? _______________________
  • Can we say these are the cases that are due to RDT use vs Microscopy Use? ________________

On your own, alter the simulation (e.g. vary EIR, etc.). How does the incidence difference plot change? What should you tell your Ministry of Health Official friend?

One thing to consider is the “cost” of a missed infection (e.g. an individual with hrp2 deleted parasites that was misdiagnosed by RDT, a “false-negative”), which is not reflected in the incidence plot above. Let’s look at that below (under the assumption that no cases were missed by microscopy under our model).

ret %>% 
  dplyr::filter(pop == 1) %>% # assume microscpy had no missed cases
  dplyr::mutate(missedprop = pfinc * hrp2prevmono) %>% 
  ggplot() +
  geom_line(aes(x=time, y=missedprop), color = "blue") + 
  xlab("Times (Days)") + ylab("hrp2-only Incidence") + 
  ggtitle("Incidence of hrp2 only infected individuals (missed case proportion)") +
    theme_minimal() +
    theme(
      plot.title = element_text(hjust=0.5),
      legend.position = "bottom"
    )

Now taking into account the “cost” of a potential missed infection, does your response to our Ministry of Health friend change?

End

Thank you for completing this lab!!